-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
migration,*: onboard TruncatedAndRangeAppliedStateMigration
#57694
Closed
irfansharif
wants to merge
12
commits into
cockroachdb:master
from
irfansharif:201207.truncated-state-migration
Closed
migration,*: onboard TruncatedAndRangeAppliedStateMigration
#57694
irfansharif
wants to merge
12
commits into
cockroachdb:master
from
irfansharif:201207.truncated-state-migration
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Release note: None
Makes for better logging. Release note: None
We can separate out the `Helper`, `Migration`, and various utilities into their own files. We'll add tests for individual components in future commits; the physical separation here sets the foundation for doing so (prototyped in cockroachdb#57445). This commit is purely code movement. Release note: None
It's clearer to talk explicitly in terms of causality. Release note: None
We re-define what the Migration type is to be able to annotate it a description. We'll later use this description when populating the `system.migrations` table (originally prototyped in cockroachdb#57445). Release note: None
We make it a bit more ergonomic (this revision was originally prototyped in cockroachdb#57445). Release note: None
To facilitate testing `Helper` in isolation, we introduce a `cluster` interface that we'll mock out in tests. It's through this interface that the migration infrastructure will able to dial out to a specific node, grab hold of a kv.DB instance, and retrieve the current cluster membership. Part of diff also downgrades `RequiredNodes` from being a first class primitive, instead tucking it away for internal usage only. Given retrieving the cluster membership made no guarantees about new nodes being added to the cluster, it's entirely possible for that to happen concurrently with it. Appropriate usage then entailed wrapping it under a stabilizing loop, like we do so in `EveryNode`. This tells us there's no need to expose it directly to migration authors. Release note: None
Release note: None
It's not currently wired up to anything. We'll use it in future PRs to send out `Migrate` requests to the entire keyspace. This was originally prototyped in cockroachdb#57445. See the inline comments and the RFC (cockroachdb#48843) for the motivation here. Release note: None
irfansharif
changed the title
migration.*: onboard
migration,*: onboard Dec 8, 2020
TruncatedAndRangeAppliedStateMigration
TruncatedAndRangeAppliedStateMigration
irfansharif
force-pushed
the
201207.truncated-state-migration
branch
from
December 8, 2020 03:31
6c0acc7
to
95fa479
Compare
This command forces all ranges overlapping with the request spans to execute the (below-raft) migrations corresponding to the specific, stated version. This has the effect of moving the range out of any legacy modes operation they may currently be in. KV waits for this command to durably apply on all the replicas before returning, guaranteeing to the caller that all pre-migration state has been completely purged from the system. We're currently not wiring it up to anything. We will in a future commit that introduces the truncated state migration. This commit was pulled out of our prototype in cockroachdb#57445. Release note: None
We introduce two new RPCs for the migration infrastructure to use. `GCReplicas` will be used to instruct the target node to process all GC-able replicas. `FlushAllEngines` will be used to instruct the target node to persist all in-memory state to disk. Both of these are necessary primitives for the migration infastructure. Specifically this comes up in the context of wanting the ensure that ranges where we've executed a ranged `Migrate` command over have no way of ever surfacing pre-migrated state. This can happen with older replicas in the replica GC queue and with applied state that is not yet persisted. We elaborate on both of these below: Motivation for `GCReplicas`: Currently we wait for the `Migrate` to have applied on all replicas of a range before returning to the caller. This does not include earlier incarnations of the range, possibly sitting idle in the replica GC queue. These replicas can still request leases, and go through the request evaluation paths, possibly tripping up assertions that check to see no pre-migrated state is found. For this reason we introduce the `GCReplicas` RPC that the migration manager can use to ensure all GC-able replicas are processed before declaring the specific cluster version bump complete. Motivation for `FlushAllEngines`: Like we mentioned above, KV currently waits for the `Migrate` command to have applied on all replicas before returning. With the applied state, there's no necessity to durably persist it (the representative version is already stored in the raft log). Out of an abundance of caution, and to really really ensure that no pre-migrated state is ever seen in the system, we provide the migration manager a mechanism to flush out all in-memory state to disk. This way the manager can guarantee that by the time a specific cluster version is bumped, all pre-migrated state from prior to that cluster version will have been fully purged from the system. --- The ideas here follow from our original prototype in cockroachdb#57445. Neither of these RPCs are currently wired up to anything. That'll happen in a future commit introducing the raft truncated state migration. Release note: None
irfansharif
force-pushed
the
201207.truncated-state-migration
branch
from
December 8, 2020 03:37
95fa479
to
36b37f3
Compare
This PR onboards the first real long-running migration using the infrastructure we've been building up within pkg/migration. It was originally prototyped in cockroachdb#57445. TruncatedAndRangeAppliedStateMigration (which also happens to be a below-Raft one) lets us do the following: i. Use the RangeAppliedState on all ranges ii. Use the unreplicated TruncatedState on all ranges In 21.2 we'll finally be able to delete holdover code that knows how to handle the legacy replicated truncated state. Release note(general change): Cluster version upgrades, as initiated by SET CLUSTER SETTING version = <major>-<minor>, now perform internal maintenance duties that will delay how long it takes for the command to complete. The delay is proportional to the amount of data currently stored in the cluster. The cluster will also experience a small amount of additional load during this period while the upgrade is being finalized. Release note: None
irfansharif
force-pushed
the
201207.truncated-state-migration
branch
from
December 8, 2020 04:57
36b37f3
to
ef96beb
Compare
Intentionally holding off on the review here until the building blocks settle, but the broad strokes here look good. |
Superceded by #58088. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR onboards the first real long-running migration using the
infrastructure we've been building up within pkg/migration. It was
originally prototyped in #57445.
TruncatedAndRangeAppliedStateMigration (which also happens to be a
below-Raft one) lets us do the following:
i. Use the RangeAppliedState on all ranges
ii. Use the unreplicated TruncatedState on all ranges
In 21.2 we'll finally be able to delete holdover code that knows how to
handle the legacy replicated truncated state.
Release note(general change): Cluster version upgrades, as initiated by
SET CLUSTER SETTING version = <major>-<minor>
, now perform internalmaintenance duties that will delay how long it takes for the command to
complete. The delay is proportional to the amount of data currently
stored in the cluster. The cluster will also experience a small amount
of additional load during this period while the upgrade is being
finalized.
Release note: None
Only the last commit here is of interest. All prior commits are from #57662,
#57650 and #57366.